Noise Robustness of Traditional Features for Macedonian Voice Dialing ASR

نویسندگان

  • Branislav Gerazov
  • Zoran Ivanovski
چکیده

Automatic Speech Recognition Systems of today are intensely deployed in real world application scenarios which are often characterized by suboptimal operating conditions. Thus their noise robustness has become a crucial parameter when assessing ASR in-field performance. The paper examines the noise robustness of traditional ASR feature sets as applied to a Voice Dialing Application built for Macedonian. The analysis focused on the following features: Linear Prediction Reflection Coefficients, Mel-Cepstral Cepstral Coefficients and Perceptual Linear Prediction Coefficients. The ASR system was trained with clean data, and in the evaluation phase the noise level in the test data was varied by adding white and babble noise. Results have been plotted for each feature type across varying SNR conditions.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Techniques for robust speech recognition in the car environment

The use of voice commands or navigation features in the car is becoming a necessity. As keyboard and display interfaces cannot be used safely while driving, much effort has been done to make automatic speech recognition (ASR) and Text-to-Speech synthesis (TTS) ubiquitous features in the car. From voice dialing to car navigation, the requirements for voice technology vary greatly. While the use ...

متن کامل

Two-layered audio-visual integration in voice activity detection and automatic speech recognition for robots

Automatic Speech Recognition (ASR) which plays an important role in human-robot interaction should be noise-robust because robots are expected to work in noisy environments. Audio-Visual (AV) integration is one of the key ideas to improve the robustness in such environments. This paper proposes two-layered AV integration for ASR which applies AV integration to Voice Activity Detection (VAD) and...

متن کامل

Phone-duration-dependent long-term dynamic features for a stochastic model-based voice activity detection

Accurate voice activity detection (VAD) is important for robust automatic speech recognition (ASR) systems. This paper proposes noise-robust VAD using long-term temporal information in speech. Long-term temporal information has been an ASR focus recently, but has not been investigated sufficiently for VAD. This paper describes an attempt to incorporate long-term temporal information into a feat...

متن کامل

ASR systems in Noisy Environment: Analysis and Solutions for Increasing Noise Robustness

This paper deals with the analysis of Automatic Speech Recognition (ASR) suitable for usage within noisy environment and suggests optimum configuration under various noisy conditions. The behavior of standard parameterization techniques was analyzed from the viewpoint of robustness against background noise. It was done for Melfrequency cepstral coefficients (MFCC), Perceptual linear predictive ...

متن کامل

Evaluation of voice activity detection by combining multiple features with weight adaptation

For noise-robust automatic speech recognition (ASR), we propose a novel voice activity detection (VAD) method based on a combination of multiple features. The scheme uses a weighted combination of four conventional VAD features: amplitude level, zero crossing rate, spectral information, and Gaussian mixture model (GMM) likelihood. The weights for combination are adaptively updated using minimum...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013